Heap (data Structure)
   HOME

TheInfoList



OR:

In
computer science Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, information theory, and automation) to Applied science, practical discipli ...
, a heap is a specialized
tree In botany, a tree is a perennial plant with an elongated stem, or trunk, usually supporting branches and leaves. In some usages, the definition of a tree may be narrower, including only woody plants with secondary growth, plants that are ...
-based data structure which is essentially an almost complete tree that satisfies the heap property: in a ''max heap'', for any given
node In general, a node is a localized swelling (a "knot") or a point of intersection (a vertex). Node may refer to: In mathematics * Vertex (graph theory), a vertex in a mathematical graph *Vertex (geometry), a point where two or more curves, lines ...
C, if P is a parent node of C, then the ''key'' (the ''value'') of P is greater than or equal to the key of C. In a ''min heap'', the key of P is less than or equal to the key of C. The node at the "top" of the heap (with no parents) is called the ''root'' node. The heap is one maximally efficient implementation of an
abstract data type In computer science, an abstract data type (ADT) is a mathematical model for data types. An abstract data type is defined by its behavior (semantics) from the point of view of a ''user'', of the data, specifically in terms of possible values, pos ...
called a priority queue, and in fact, priority queues are often referred to as "heaps", regardless of how they may be implemented. In a heap, the highest (or lowest) priority element is always stored at the root. However, a heap is not a sorted structure; it can be regarded as being partially ordered. A heap is a useful data structure when it is necessary to repeatedly remove the object with the highest (or lowest) priority, or when insertions need to be interspersed with removals of the root node. A common implementation of a heap is the
binary heap A binary heap is a heap data structure that takes the form of a binary tree. Binary heaps are a common way of implementing priority queues. The binary heap was introduced by J. W. J. Williams in 1964, as a data structure for heapsort. A bin ...
, in which the tree is a binary tree (see figure). The heap data structure, specifically the binary heap, was introduced by
J. W. J. Williams John William Joseph Williams (2 September 1930 – 29 September 2012) was a Welsh-Canadian computer scientist best known for inventing in 1964 heapsort and the binary heap data structure. He was born in Chippenham, Wiltshire''England & Wales, Civ ...
in 1964, as a data structure for the
heapsort In computer science, heapsort is a comparison-based sorting algorithm. Heapsort can be thought of as an improved selection sort: like selection sort, heapsort divides its input into a sorted and an unsorted region, and it iteratively shrinks the ...
sorting algorithm. Heaps are also crucial in several efficient
graph algorithms The following is a list of well-known algorithms along with one-line descriptions for each. Automated planning Combinatorial algorithms General combinatorial algorithms * Brent's algorithm: finds a cycle in function value iterations using on ...
such as
Dijkstra's algorithm Dijkstra's algorithm ( ) is an algorithm for finding the shortest paths between nodes in a graph, which may represent, for example, road networks. It was conceived by computer scientist Edsger W. Dijkstra in 1956 and published three years ...
. When a heap is a complete binary tree, it has a smallest possible height—a heap with ''N'' nodes and ''a'' branches for each node always has log''a'' ''N'' height. Note that, as shown in the graphic, there is no implied ordering between siblings or cousins and no implied sequence for an
in-order traversal In computer science, tree traversal (also known as tree search and walking the tree) is a form of graph traversal and refers to the process of visiting (e.g. retrieving, updating, or deleting) each node in a tree data structure, exactly once. S ...
(as there would be in, e.g., a
binary search tree In computer science, a binary search tree (BST), also called an ordered or sorted binary tree, is a rooted binary tree data structure with the key of each internal node being greater than all the keys in the respective node's left subtree and ...
). The heap relation mentioned above applies only between nodes and their parents, grandparents, etc. The maximum number of children each node can have depends on the type of heap.


Operations

The common operations involving heaps are: ;Basic * ''find-max'' (or ''find-min''): find a maximum item of a max-heap, or a minimum item of a min-heap, respectively (a.k.a. ''
peek Polyether ether ketone (PEEK) is a colourless organic thermoplastic polymer in the polyaryletherketone (PAEK) family, used in engineering applications. The polymer was first developed in November 1978, later being introduced to the market by Vic ...
'') * ''insert'': adding a new key to the heap (a.k.a., ''push'') * ''extract-max'' (or ''extract-min''): returns the node of maximum value from a max heap r minimum value from a min heapafter removing it from the heap (a.k.a., ''pop'') * ''delete-max'' (or ''delete-min''): removing the root node of a max heap (or min heap), respectively * ''replace'': pop root and push a new key. More efficient than pop followed by push, since only need to balance once, not twice, and appropriate for fixed-size heaps. ;Creation * ''create-heap'': create an empty heap * ''heapify'': create a heap out of given array of elements * ''merge'' (''union''): joining two heaps to form a valid new heap containing all the elements of both, preserving the original heaps. * ''meld'': joining two heaps to form a valid new heap containing all the elements of both, destroying the original heaps. ;Inspection * ''size'': return the number of items in the heap. * ''is-empty'': return true if the heap is empty, false otherwise. ;Internal * ''increase-key'' or ''decrease-key'': updating a key within a max- or min-heap, respectively * ''delete'': delete an arbitrary node (followed by moving last node and sifting to maintain heap) * ''sift-up'': move a node up in the tree, as long as needed; used to restore heap condition after insertion. Called "sift" because node moves up the tree until it reaches the correct level, as in a
sieve A sieve, fine mesh strainer, or sift, is a device for separating wanted elements from unwanted material or for controlling the particle size distribution of a sample, using a screen such as a woven mesh or net or perforated sheet material. T ...
. * ''sift-down'': move a node down in the tree, similar to sift-up; used to restore heap condition after deletion or replacement.


Implementation

Heaps are usually implemented with an
array An array is a systematic arrangement of similar objects, usually in rows and columns. Things called an array include: {{TOC right Music * In twelve-tone and serial composition, the presentation of simultaneous twelve-tone sets such that the ...
, as follows: * Each element in the array represents a node of the heap, and * The parent / child relationship is defined implicitly by the elements' indices in the array. For a
binary heap A binary heap is a heap data structure that takes the form of a binary tree. Binary heaps are a common way of implementing priority queues. The binary heap was introduced by J. W. J. Williams in 1964, as a data structure for heapsort. A bin ...
, in the array, the first index contains the root element. The next two indices of the array contain the root's children. The next four indices contain the four children of the root's two child nodes, and so on. Therefore, given a node at index , its children are at indices and , and its parent is at index . This simple indexing scheme makes it efficient to move "up" or "down" the tree. Balancing a heap is done by sift-up or sift-down operations (swapping elements which are out of order). As we can build a heap from an array without requiring extra memory (for the nodes, for example),
heapsort In computer science, heapsort is a comparison-based sorting algorithm. Heapsort can be thought of as an improved selection sort: like selection sort, heapsort divides its input into a sorted and an unsorted region, and it iteratively shrinks the ...
can be used to sort an array in-place. After an element is inserted into or deleted from a heap, the heap property may be violated, and the heap must be re-balanced by swapping elements within the array. Although different type of heaps implement the operations differently, the most common way is as follows: * Insertion: Add the new element at the end of the heap, in the first available free space. If this will violate the heap property, sift up the new element (''swim'' operation) until the heap property has been reestablished. * Extraction: Remove the root and insert the last element of the heap in the root. If this will violate the heap property, sift down the new root (''sink'' operation) to reestablish the heap property. * Replacement: Remove the root and put the ''new'' element in the root and sift down. When compared to extraction followed by insertion, this avoids a sift up step. Construction of a binary (or ''d''-ary) heap out of a given array of elements may be performed in linear time using the classic Floyd algorithm, with the worst-case number of comparisons equal to 2''N'' − 2''s''2(''N'') − ''e''2(''N'') (for a binary heap), where ''s''2(''N'') is the sum of all digits of the binary representation of ''N'' and ''e''2(''N'') is the exponent of 2 in the prime factorization of ''N''. This is faster than a sequence of consecutive insertions into an originally empty heap, which is log-linear.


Variants

* 2–3 heap * B-heap *
Beap A beap, or bi-parental heap, is a data structure where a node usually has two parents (unless it is the first or last on a level) and two children (unless it is on the last level). Unlike a heap, a beap allows sublinear search. The beap was intro ...
*
Binary heap A binary heap is a heap data structure that takes the form of a binary tree. Binary heaps are a common way of implementing priority queues. The binary heap was introduced by J. W. J. Williams in 1964, as a data structure for heapsort. A bin ...
*
Binomial heap In computer science, a binomial heap is a data structure that acts as a priority queue but also allows pairs of heaps to be merged. It is important as an implementation of the mergeable heap abstract data type (also called meldable heap), whi ...
*
Brodal queue In computer science, the Brodal queue is a heap/priority queue structure with very low worst case In computer science, best, worst, and average cases of a given algorithm express what the resource usage is ''at least'', ''at most'' and ''on aver ...
* ''d''-ary heap *
Fibonacci heap In computer science, a Fibonacci heap is a data structure for priority queue operations, consisting of a collection of heap-ordered trees. It has a better amortized running time than many other priority queue data structures including the binar ...
* K-D Heap * Leaf heap * Leftist heap *
Pairing heap A pairing heap is a type of heap data structure with relatively simple implementation and excellent practical amortized performance, introduced by Michael Fredman, Robert Sedgewick, Daniel Sleator, and Robert Tarjan in 1986. Pairing heaps are ...
*
Radix heap A radix heap is a data structure for realizing the operations of a monotone priority queue. A set of elements to which a key is assigned can then be managed. The run time of the operations depends on the difference between the largest and smalles ...
* Randomized meldable heap *
Skew heap A skew heap (or self-adjusting heap) is a heap data structure implemented as a binary tree. Skew heaps are advantageous because of their ability to merge more quickly than binary heaps. In contrast with binary heaps, there are no structural constr ...
*
Soft heap In computer science, a soft heap is a variant on the simple heap data structure that has constant amortized time complexity for 5 types of operations. This is achieved by carefully "corrupting" (increasing) the keys of at most a constant number ...
*
Ternary heap Ternary (from Latin ''ternarius'') or trinary is an adjective meaning "composed of three items". It can refer to: Mathematics and logic * Ternary numeral system, a base-3 counting system ** Balanced ternary, a positional numeral system, useful ...
*
Treap In computer science, the treap and the randomized binary search tree are two closely related forms of binary search tree data structures that maintain a dynamic set of ordered keys and allow binary searches among the keys. After any sequence of in ...
*
Weak heap In computer science, a weak heap is a data structure for priority queues, combining features of the binary heap and binomial heap. It can be stored in an array as an implicit binary tree like a binary heap, and has the efficiency guarantees of ...


Comparison of theoretic bounds for variants


Applications

The heap data structure has many applications. *
Heapsort In computer science, heapsort is a comparison-based sorting algorithm. Heapsort can be thought of as an improved selection sort: like selection sort, heapsort divides its input into a sorted and an unsorted region, and it iteratively shrinks the ...
: One of the best sorting methods being in-place and with no quadratic worst-case scenarios. *
Selection algorithm In computer science, a selection algorithm is an algorithm for finding the ''k''th smallest number in a list or array; such a number is called the ''k''th ''order statistic''. This includes the cases of finding the minimum, maximum, and median e ...
s: A heap allows access to the min or max element in constant time, and other selections (such as median or kth-element) can be done in sub-linear time on data that is in a heap. *
Graph algorithms The following is a list of well-known algorithms along with one-line descriptions for each. Automated planning Combinatorial algorithms General combinatorial algorithms * Brent's algorithm: finds a cycle in function value iterations using on ...
: By using heaps as internal traversal data structures, run time will be reduced by polynomial order. Examples of such problems are Prim's minimal-spanning-tree algorithm and Dijkstra's shortest-path algorithm. * Priority Queue: A priority queue is an abstract concept like "a list" or "a map"; just as a list can be implemented with a linked list or an array, a priority queue can be implemented with a heap or a variety of other methods. *
K-way merge Merge algorithms are a family of algorithms that take multiple sorted lists as input and produce a single list as output, containing all the elements of the inputs lists in sorted order. These algorithms are used as subroutines in various sorting ...
: A heap data structure is useful to merge many already-sorted input streams into a single sorted output stream. Examples of the need for merging include external sorting and streaming results from distributed data such as a log structured merge tree. The inner loop is obtaining the min element, replacing with the next element for the corresponding input stream, then doing a sift-down heap operation. (Alternatively the replace function.) (Using extract-max and insert functions of a priority queue are much less efficient.) *
Order statistics In statistics, the ''k''th order statistic of a statistical sample is equal to its ''k''th-smallest value. Together with rank statistics, order statistics are among the most fundamental tools in non-parametric statistics and inference. Impor ...
: The Heap data structure can be used to efficiently find the kth smallest (or largest) element in an array.


Programming language implementations

* The
C++ Standard Library The C standard library or libc is the standard library for the C programming language, as specified in the ISO C standard. ISO/IEC (2018). '' ISO/IEC 9899:2018(E): Programming Languages - C §7'' Starting from the original ANSI C standard, it was ...
provides the , and algorithms for heaps (usually implemented as binary heaps), which operate on arbitrary random access
iterator In computer programming, an iterator is an object that enables a programmer to traverse a container, particularly lists. Various types of iterators are often provided via a container's interface. Though the interface and semantics of a given iterat ...
s. It treats the iterators as a reference to an array, and uses the array-to-heap conversion. It also provides the container adaptor , which wraps these facilities in a container-like class. However, there is no standard support for the replace, sift-up/sift-down, or decrease/increase-key operations. * The
Boost C++ libraries Boost, boosted or boosting may refer to: Science, technology and mathematics * Boost, positive manifold pressure in turbocharged engines * Boost (C++ libraries), a set of free peer-reviewed portable C++ libraries * Boost (material), a material b ...
include a heaps library. Unlike the STL, it supports decrease and increase operations, and supports additional types of heap: specifically, it supports ''d''-ary, binomial, Fibonacci, pairing and skew heaps. * There is
generic heap implementation
for C and
C++ C++ (pronounced "C plus plus") is a high-level general-purpose programming language created by Danish computer scientist Bjarne Stroustrup as an extension of the C programming language, or "C with Classes". The language has expanded significan ...
with D-ary heap and B-heap support. It provides an STL-like API. * The standard library of the
D programming language D, also known as dlang, is a multi-paradigm system programming language created by Walter Bright at Digital Mars and released in 2001. Andrei Alexandrescu joined the design and development effort in 2007. Though it originated as a re-engineer ...
include

which is implemented in terms of D'
ranges
Instances can be constructed from an

exposes a

that allows iteration with D's built-in statements and integration with the range-based API of th

* For
Haskell Haskell () is a general-purpose, statically-typed, purely functional programming language with type inference and lazy evaluation. Designed for teaching, research and industrial applications, Haskell has pioneered a number of programming lan ...
there is th

module. * The
Java Java (; id, Jawa, ; jv, ꦗꦮ; su, ) is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea to the north. With a population of 151.6 million people, Java is the world's mos ...
platform (since version 1.5) provides a binary heap implementation with the class in the
Java Collections Framework The Java collections framework is a set of classes and interfaces that implement commonly reusable collection data structures. Although referred to as a framework, it works in a manner of a library. The collections framework provides both interf ...
. This class implements by default a min-heap; to implement a max-heap, programmer should write a custom comparator. There is no support for the replace, sift-up/sift-down, or decrease/increase-key operations. *
Python Python may refer to: Snakes * Pythonidae, a family of nonvenomous snakes found in Africa, Asia, and Australia ** ''Python'' (genus), a genus of Pythonidae found in Africa and Asia * Python (mythology), a mythical serpent Computing * Python (pro ...
has

module that implements a priority queue using a binary heap. The library exposes a heapreplace function to support k-way merging. *
PHP PHP is a general-purpose scripting language geared toward web development. It was originally created by Danish-Canadian programmer Rasmus Lerdorf in 1993 and released in 1995. The PHP reference implementation is now produced by The PHP Group. ...
has both max-heap () and min-heap () as of version 5.3 in the Standard PHP Library. *
Perl Perl is a family of two high-level, general-purpose, interpreted, dynamic programming languages. "Perl" refers to Perl 5, but from 2000 to 2019 it also referred to its redesigned "sister language", Perl 6, before the latter's name was offic ...
has implementations of binary, binomial, and Fibonacci heaps in th

distribution available on
CPAN The Comprehensive Perl Archive Network (CPAN) is a repository of over 250,000 software modules and accompanying documentation for 39,000 distributions, written in the Perl programming language by over 12,000 contributors. ''CPAN'' can denote eit ...
. * The Go language contains

package with heap algorithms that operate on an arbitrary type that satisfies a given interface. That package does not support the replace, sift-up/sift-down, or decrease/increase-key operations. * Apple's
Core Foundation Core Foundation (also called CF) is a C application programming interface (API) written by Apple for its operating systems, and is a mix of low-level routines and wrapper functions. Most Core Foundation routines follow a certain naming conventi ...
library contains

structure. *
Pharo Pharo is an open source, cross-platform implementation of the classic Smalltalk-80 programming language and runtime. It's based on the OpenSmalltalk virtual machine called Cog (VM), which evaluates a dynamic, reflective, and object-orient ...
has an implementation of a heap in the Collections-Sequenceable package along with a set of test cases. A heap is used in the implementation of the timer event loop. * The
Rust Rust is an iron oxide, a usually reddish-brown oxide formed by the reaction of iron and oxygen in the catalytic presence of water or air moisture. Rust consists of hydrous iron(III) oxides (Fe2O3·nH2O) and iron(III) oxide-hydroxide (FeO(OH ...
programming language has a binary max-heap implementation

in the module of its standard library. * .NET ha
PriorityQueue
class which uses quarternary (d-ary) min-heap implementation. It is available from .NET 6.


See also

*
Sorting algorithm In computer science, a sorting algorithm is an algorithm that puts elements of a list into an order. The most frequently used orders are numerical order and lexicographical order, and either ascending or descending. Efficient sorting is important ...
*
Search data structure In computer science, a search data structure is any data structure that allows the efficient retrieval of specific items from a set of items, such as a specific record from a database. The simplest, most general, and least efficient search struc ...
* Stack (abstract data type) *
Queue (abstract data type) In computer science, a queue is a collection (abstract data type), collection of entities that are maintained in a sequence and can be modified by the addition of entities at one end of the sequence and the removal of entities from the other end ...
*
Tree (data structure) In computer science, a tree is a widely used abstract data type that represents a hierarchical tree structure with a set of connected nodes. Each node in the tree can be connected to many children (depending on the type of tree), but must be c ...
*
Treap In computer science, the treap and the randomized binary search tree are two closely related forms of binary search tree data structures that maintain a dynamic set of ordered keys and allow binary searches among the keys. After any sequence of in ...
, a form of binary search tree based on heap-ordered trees


References


External links


Heap
at Wolfram MathWorld

of how the basic heap algorithms work * {{DEFAULTSORT:Heap (Data Structure)